Week 10 of 12 · Part C — Governance

The Safety Case

Locking in Week 10 — assembling the framework, card, and register into one defensible argument

Day 50 of 60 · ~50 minutes · Review

What you now hold

This week you stepped out of testing and into governance. You can name the NIST AI RMF's four functions and slot your Part A work into them; you can write an honest model card that leads with limitations; you built a scored, owned risk register; and you understand how labs gate safeguards on measured capability. Each of those is a document — and a document, unlike a feeling, can be reviewed, challenged, and signed.

The through-line of Week 10

Governance is the discipline of making safety auditable. A framework gives the process, a model card gives honest reporting, a risk register gives accountability, and a scaling policy gives pre-committed gates. Together they form a safety case: a structured, evidence-backed argument that a system is safe enough to deploy — and the evidence is the artifacts you built.

The safety case as the unifying idea

Everything in Part C points at one artifact: the safety case. It's not a new document so much as the argument that ties the others together — "here is why we believe this system is safe enough to deploy, and here is the evidence." Each piece you built this week is a load-bearing claim in that argument.

The Structure of a Safety Case

1 · The claim — "safe enough to deploy, for this use"

Bounded by intended use and non-use, straight from your model card. A safety case is never "safe" in the abstract; it's safe for a specified deployment.

2 · The evidence — your measurements

Evals, red-team results, robustness reports, disaggregated performance. This is Parts A and B, marshaled as proof. A claim with no evidence is an opinion.

3 · The residual risk — what we accept, and who owns it

Your risk register, scored and mitigated, with the residual risk after mitigation made explicit. An honest safety case states what it's not claiming to have eliminated.

4 · The gates — what would change the answer

The capability thresholds and tripwire evals from your scaling-policy work: the conditions under which "yes" becomes "no." A safety case that can't be falsified by a future eval isn't a case — it's a hope.
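The four parts above can be sketched as a small data structure. This is a minimal illustration, not a standard schema: the class names, the field names, the example deployment, the eval name, and the 0.20 threshold are all hypothetical, and the rule that a higher score is worse is an assumption made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Gate:
    """A tripwire: the case fails if the measured score exceeds the threshold."""
    eval_name: str
    threshold: float  # assumption: higher score means more concerning


@dataclass
class SafetyCase:
    claim: str                      # bounded by intended use and non-use
    evidence: list[str]             # names of the artifacts backing the claim
    residual_risks: dict[str, str]  # accepted risk -> owner
    gates: list[Gate] = field(default_factory=list)

    def recommend(self, scores: dict[str, float]) -> str:
        """Return 'go' or 'no-go' given the latest eval scores."""
        if not self.evidence:
            return "no-go"  # a claim with no evidence is an opinion
        if any(not owner for owner in self.residual_risks.values()):
            return "no-go"  # every accepted risk needs an owner
        for gate in self.gates:
            score = scores.get(gate.eval_name)
            # A missing score trips the gate: unmeasured is not the same as safe.
            if score is None or score > gate.threshold:
                return "no-go"
        return "go"


# Hypothetical example values, for illustration only.
case = SafetyCase(
    claim="Safe enough to deploy as an internal support-ticket summarizer",
    evidence=["eval_report", "red_team_results", "robustness_report"],
    residual_risks={"prompt injection via ticket text": "security-lead"},
    gates=[Gate("hazardous_capability_eval", threshold=0.20)],
)

decision_now = case.recommend({"hazardous_capability_eval": 0.05})
decision_later = case.recommend({"hazardous_capability_eval": 0.35})
print(decision_now, decision_later)
```

The design choice worth noticing is that the recommendation is recomputed from fresh scores each time, so a future eval can flip the answer. That is what makes the case falsifiable.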

Why this is the governance layer in miniature

A safety review board doesn't want your enthusiasm; it wants a case it can interrogate. Assemble your framework mapping, model card, and risk register as one argument and you've produced exactly the thing a sign-off rests on. That assembly — claim, evidence, residual risk, gates — is the whole job of Part C compressed into a page.

Self-quiz — can you do these without notes?

Prove the Week

  1. Name the four NIST AI RMF functions and say which of your Part A artifacts feeds each. Re-skim the AI RMF core if you stall.
  2. List the sections of a model card from memory, and explain why honest limitation-reporting is itself a safety property.
  3. Explain capability-tier-gated safeguards (as in Anthropic's Responsible Scaling Policy, RSP, and OpenAI's Preparedness Framework) and what triggers a higher tier, then state why pre-commitment is the point.
  4. Take your risk register, cross-check it against the MIT AI Risk Repository domains, and confirm each top risk has a mitigation and an owner.
  5. Write the one-paragraph safety case your artifacts support — claim, evidence, residual risk, gates — and the go/no-go recommendation it implies.
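
The register check in item 4 can be partly automated. The sketch below assumes a register stored as a list of dictionaries and a score computed as likelihood times impact. Both are assumptions for illustration, as are the field names, the cutoff of three top risks, and the sample rows. Use whatever scoring scheme your own register defines.

```python
def unresolved_top_risks(register, top_n=3):
    """Return ids of the top-N risks (by score) that lack a mitigation or an owner."""
    # Assumed scoring: likelihood * impact. Swap in your register's own formula.
    ranked = sorted(
        register,
        key=lambda risk: risk["likelihood"] * risk["impact"],
        reverse=True,
    )
    return [
        risk["id"]
        for risk in ranked[:top_n]
        if not risk.get("mitigation") or not risk.get("owner")
    ]


# Hypothetical sample rows, for illustration only.
register = [
    {"id": "R1", "likelihood": 4, "impact": 5, "mitigation": "input filtering", "owner": "security-lead"},
    {"id": "R2", "likelihood": 3, "impact": 4, "mitigation": "", "owner": "ml-lead"},
    {"id": "R3", "likelihood": 2, "impact": 5, "mitigation": "human review", "owner": None},
    {"id": "R4", "likelihood": 1, "impact": 2, "mitigation": "", "owner": ""},
]

gaps = unresolved_top_risks(register)
print(gaps)  # R2 has no mitigation and R3 has no owner; R4 is outside the top three
```

An empty result means every top risk has both a mitigation and an owner. The cross-check against the MIT AI Risk Repository domains still needs a human read.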

The expert move

A practitioner hands over a stack of test results and lets the reviewer assemble the argument. An expert hands over the argument itself: a safety case where every claim is backed by an artifact and the residual risk and gates are stated plainly. The altitude jump is from producing evidence to making a defensible, falsifiable case for deployment. That is the document a review board can actually sign, because it tells them the truth about what it does not claim.

Say this in an interview: "I don't just run evals — I assemble a safety case: a bounded claim about a specific deployment, the evidence behind it, the residual risk we're accepting and who owns it, and the gates that would flip the answer to no. That's the document a review board signs, and every line of it points back to an artifact I built."

Week 10 Takeaways